Dimension Reduction of Microarray Data with Penalized Independent Component Analysis
Abstract
In this paper we propose independent component analysis (ICA) as a dimension reduction technique for microarray data. All microarray studies present a dimensionality challenge: the number of dimensions (genes or spots on the microarray) is many times larger than the number of samples, or arrays. Any subsequent analysis must deal with this problem either by reducing the dimension of the data or by incorporating assumptions in the model that effectively regularize the solution. We use ICA with a regularized whitening technique to reduce the dimension to a small set of independent sources, or latent variables, which can then be used in downstream analysis. The elements of the mixing matrix can themselves be investigated to gain more understanding of the genetic underpinnings of the process that generated the data. While a number of researchers have proposed ICA as a model for microarray data, this paper differs in an important respect: we focus on ICA as a dimension reduction step, which leads us to a generative model formulation that applies ICA in the opposite direction to most other proposals in this field.
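The idea of regularized whitening followed by ICA can be illustrated with a minimal NumPy sketch. This is not the authors' algorithm: the ridge term `lam` added to the covariance eigenvalues, the tanh-based symmetric FastICA update, the function names, and the synthetic data are all assumptions made for illustration. The sketch treats the n samples as observations and the p genes as mixed variables, so the centered data is approximated as `S @ A`, with `S` holding k latent sources per sample and `A` the k-by-p mixing matrix.

```python
import numpy as np

def regularized_whiten(X, k, lam=1e-2):
    """Project centered data onto k principal directions, rescaled with a ridge term."""
    Xc = X - X.mean(axis=0)                    # center each gene (column)
    n = Xc.shape[0]
    # Thin SVD, Xc = U diag(d) Vt; inexpensive because n is much smaller than p
    _, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = d[:k] ** 2 / n                   # top-k covariance eigenvalues
    scale = 1.0 / np.sqrt(eigvals + lam)       # ridge-regularized inverse square root
    Z = (Xc @ Vt[:k].T) * scale                # n x k, approximately whitened scores
    return Z, Vt[:k], scale

def fastica_symmetric(Z, n_iter=200, tol=1e-6, seed=0):
    """Symmetric FastICA with a tanh nonlinearity on (approximately) whitened Z."""
    rng = np.random.default_rng(seed)
    n, k = Z.shape
    W = np.linalg.qr(rng.standard_normal((k, k)))[0]   # random orthogonal start
    for _ in range(n_iter):
        Y = Z @ W.T                            # current source estimates, n x k
        G = np.tanh(Y)
        G_prime = 1.0 - G ** 2
        # Fixed-point update per row: E[z g(w.z)] - E[g'(w.z)] w
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        # Symmetric decorrelation: replace W_new by its nearest orthogonal matrix
        u, _, vt = np.linalg.svd(W_new)
        W_new = u @ vt
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return Z @ W.T, W

# Synthetic stand-in for a microarray matrix: 20 arrays, 500 genes (hypothetical data)
rng = np.random.default_rng(1)
X = rng.standard_normal((20, 500))

Z, Vk, scale = regularized_whiten(X, k=3)
S, W = fastica_symmetric(Z)                    # S: 20 x 3 latent sources per sample
A = (W / scale) @ Vk                           # 3 x 500 mixing matrix, Xc ~ S @ A
```

Because the ridge term shrinks each whitened direction, `Z` has covariance slightly below the identity, so the ICA step here is only approximate. The rows of `A` assign a loading to every gene, which is the object the abstract suggests inspecting for biological interpretation.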
Similar resources
Predictive model building for microarray data using generalized partial least squares model
Microarray technology enables simultaneous monitoring of the expression of hundreds of thousands of genes in an entire genome. This results in microarray data with the number of genes p far exceeding the number of samples n. Traditional statistical methods do not work well when n ≪ p. Dimension reduction methods are often required before applying standard statistical methods, popular among the...
Penalized Bregman Divergence Estimation via Coordinate Descent
Variable selection via penalized estimation is appealing for dimension reduction. For penalized linear regression, Efron et al. (2004) introduced the LARS algorithm. Recently, the coordinate descent (CD) algorithm was developed by Friedman et al. (2007) for penalized linear regression and penalized logistic regression and was shown to be computationally superior. This paper explores...
PLS and SVD based penalized logistic regression for cancer classification using microarray data
Accurate cancer prediction is important for the treatment of cancers. The combination of two dimension reduction methods, partial least squares (PLS) and singular value decomposition (SVD), with penalized logistic regression (PLR) has created powerful classifiers for cancer prediction using microarray data. Compared with support vector machine (SVM) on seven publicly available cancer datasets,...
Developing a Filter-Wrapper Feature Selection Method and its Application in Dimension Reduction of Gen Expression
Nowadays, the growing volume of data and number of attributes in datasets has reduced the accuracy of learning algorithms and increased their computational complexity. Feature selection is one dimensionality reduction method, and it is carried out through filtering and wrapping. Wrapper methods are more accurate than filter methods, whereas filter methods run faster and carry a lower computational burden. With ...
Classification using partial least squares with penalized logistic regression
MOTIVATION: One important aspect of data-mining of microarray data is to discover the molecular variation among cancers. In microarray studies, the number n of samples is relatively small compared to the number p of genes per sample (usually in thousands). It is known that standard statistical methods in classification are efficient (i.e. in the present case, yield successful classifiers) partic...